Valid Statistical Analysis for Logistic Regression with Multiple Sources

Authors

  • Stephen E. Fienberg
  • Yuval Nardi
  • Aleksandra B. Slavkovic
Abstract

Considerable effort has gone into understanding privacy protection for individual information in single databases, and various solutions have been proposed depending on the nature of the data, the ways in which the database will be used, and the precise nature of the privacy protection being offered. Once data are merged across sources, however, the problem becomes far more complex, and a number of privacy issues arise for the linked individual files that go well beyond those considered for the data within individual sources. In this paper, we propose an approach that allows a full statistical analysis of the combined database without actually combining it. We focus mainly on logistic regression, but the method and tools described can be applied to other statistical models as well.
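
As a rough illustration of what "analysis on the combined database without actually combining it" can look like, the sketch below fits a logistic regression by Newton-Raphson when each party contributes only its local gradient and Hessian at the current coefficients. This is not the authors' protocol: it assumes horizontally partitioned data (each party holds different records with the same covariates), it exchanges the summaries in the clear rather than through a secure summation step, and all names (`local_summaries`, `fit_from_parties`, `party_a`, `party_b`) and the toy data are hypothetical.

```python
import math

def local_summaries(X, y, beta):
    """Gradient and negative Hessian of one party's log-likelihood at beta."""
    p = len(beta)
    grad = [0.0] * p
    hess = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
        w = mu * (1.0 - mu)                 # Newton weight
        for j in range(p):
            grad[j] += (yi - mu) * xi[j]
            for k in range(p):
                hess[j][k] += w * xi[j] * xi[k]
    return grad, hess

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_from_parties(parties, p, iters=25):
    """Newton-Raphson using only per-party summaries, never raw rows."""
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for X, y in parties:
            g, h = local_summaries(X, y, beta)   # computed locally by each party
            grad = [a + b for a, b in zip(grad, g)]
            hess = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(hess, h)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Hypothetical toy data: first column is the intercept, second a covariate.
party_a = ([[1, 0.5], [1, 1.5], [1, 2.5], [1, 3.5]], [0, 1, 0, 1])
party_b = ([[1, 1.0], [1, 2.0], [1, 3.0], [1, 4.0]], [0, 0, 1, 1])

beta_split = fit_from_parties([party_a, party_b], p=2)
pooled = (party_a[0] + party_b[0], party_a[1] + party_b[1])
beta_pooled = fit_from_parties([pooled], p=2)
```

Because the log-likelihood is a sum over records, the summed per-party gradients and Hessians equal those of the pooled data, so `beta_split` and `beta_pooled` agree up to floating-point rounding. The summaries themselves can still leak information, which is why a real protocol would combine them with a privacy-preserving summation step.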

Similar resources

An application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case

Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks stems from multiple variables and dynamic factors arising from the uncertainties surrounding PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, they seem to be ineff...

Comparing Discriminant Analysis, Ecological Niche Factor Analysis and Logistic Regression Methods for Geographic Distribution Modelling of Eurotia ceratoides (L.) C. A. Mey

Eurotia ceratoides (L.) C. A. Mey is an important plant species in semi-arid lands in Iran. New approaches are required to determine the distribution of this plant species. For this reason, geographical distributions of Eurotia ceratoides were assessed using three different models including: Multiple Discriminant Analysis (MDA), Ecological Niche Factor Analysis (ENFA) and Logistic Regression (LR). ...

Achieving Both Valid and Secure Logistic Regression Analysis on Aggregated Data from Different Private Sources

Preserving the privacy of individual databases when carrying out statistical calculations has a relatively long history in statistics and has been the focus of much recent attention in machine learning. In this paper, we present a protocol for fitting a logistic regression when the data are held by separate parties, without actually combining information sources, by exploiting results from the li...

Bayesian and Iterative Maximum Likelihood Estimation of the Coefficients in Logistic Regression Analysis with Linked Data

This paper considers logistic regression analysis with linked data. It is shown that, in this setting, a finite mixture of Bernoulli distributions can be used to model the response variables. We propose an iterative maximum likelihood estimator for the regression coefficients that takes the matching probabilities into account. Next, the Bayesian counterpart...

Using Multiple-Variable Matching to Identify EFL Ecological Sources of Differential Item Functioning

Context is a vague notion with numerous building blocks, which makes inferences from language test scores quite convoluted. This study uses a model of item responding that seeks to theorize the contextual infrastructure of differential item functioning (DIF) research and to help specify the sources of DIF. Two steps were taken in this research: first, to identify DIF by gender grouping via l...

Publication date: 2008